Second-Order Region for Gray-Wyner Network


Shun Watanabe 


Abstract 

The coding problem over the Gray-Wyner network is studied from the second-order coding rates perspective. A 
tilted information density for this network is introduced in the spirit of Kostina-Verdu, and, under a certain regularity 
condition, the second-order region is characterized in terms of the variance of this tilted information density and the 
tangent vector of the first-order region. The second-order region is proved by the type method: the achievability part 
is proved by the type-covering argument, and the converse part is proved by a refinement of the perturbation approach 
that was used by Gu-Effros to show the strong converse of the Gray-Wyner network. This is the first instance that the 
second-order region is characterized for a multi-terminal problem where the characterization of the first-order region 
involves an auxiliary random variable. 


I. Introduction 

We study the coding problem over the Gray-Wyner network [6] from the second-order coding rates perspective.
The study of the second-order coding rates has attracted significant interest in recent years since it gives a good 
approximation for the finite blocklength performance of certain coding systems ED, El. The second-order coding 
rates for point-to-point systems are quite well-understood ll29l . flOl . 651 . (9) , fl9l . 65) . IT6l . Ifl3l . fl7l . 651 . 1T8I . 
[36;|. On the other hand, the extension of the second-order analysis to multi-terminal problems is rather immature; 
some problems are solved completely Oil . 64) , (32), 601 . 68) . ffTTl . 64) , 651 , but only achievability bounds are 
known for other problems OD . 6D , IT2) . l27l . 67) . 69) , (26) . See l30l for further review of existing results on 
the second-order analysis. 

The Gray-Wyner network is described in Fig. 1. The network consists of one encoder and two decoders. The
encoder and both the decoders are connected by the common channel, and each decoder is also connected to the 
encoder by its own private channel. Then, the goal for each decoder is to almost losslessly reproduce one part 
of correlated sources, and we are interested in the optimal trade-off among the rates of the three channels. The 
information-theoretic characterization of achievable rate triplets was derived in [6], and, as is typical for
multi-terminal problems (cf. 0), it involves an auxiliary random variable, which makes the second-order analysis of this
problem non-trivial.
A. Contributions

We characterize the second-order region of the Gray-Wyner network under a certain regularity condition.¹ For
that purpose, we introduce a tilted information density for this network in the spirit of Kostina-Verdu [16]. Then, the
second-order region is characterized in terms of the variance of this tilted information density and the tangent vector 
of the first-order region. Since the first-order region of the Gray-Wyner network is characterized by an auxiliary 
random variable, the tilted information density is defined by using that auxiliary random variable. In general, there 
is no guarantee that an optimal test channel is unique, and more than one optimal test channel may exist. However, 
we show that the tilted information density is uniquely defined irrespective of the choice of optimal test channels. 
Also, we show some other properties of the tilted information density. 

In [6], the plane where the sum of the three rates coincides with the joint entropy of the correlated sources was
called the Pangloss plane, and it gained special attention since there is no sum-rate loss compared to cooperative
decoding schemes on this plane. When the first-order rates are on the Pangloss plane, as an illustration of our main
result, we show a simple expression of the second-order region. Interestingly, the sum constraint on the second-order
rates coincides with the one that can be achieved by cooperative decoding schemes; this means that there is no sum-rate
loss compared to cooperative decoding schemes even up to the second order.

In the proof of the second-order region, we use the type method. The achievability part is proved by an application 
of the type covering argument (cf. [40] and [4, Chapter 9]). For the converse part, we refine the perturbation approach
that was used by Gu-Effros [7], [8] to show the strong converse of the Gray-Wyner network. By these arguments,
we first derive an upper bound and a lower bound on the error probability in terms of a probability of a certain 
function of the joint type. Then, we approximate that probability by using the central limit theorem. 

When we use the type method for the second-order analysis, say the rate-distortion problem, we take a derivative 
of the rate-distortion function with respect to the source distribution, and the second-order rate is characterized in 
terms of the variance of that derivative (cf. Hjl '). Then, we can show that that characterization coincides with the 
variance of the d-tilted information introduced in [16]. In this paper, we consider a slightly different argument.
When we take a derivative of a certain function of a distribution, we have to extend the domain of that function
to the outside of the probability simplex (cf. [22, Appendix A]). In order to circumvent such an extension, we
consider a different parameterization of the probability simplex, which is often used in information geometry [1].
Then, we take a derivative of the function with respect to that parameter. Also, instead of introducing the variance 
of the derivative, we directly characterize the second-order region in terms of the variance of the tilted information 
density. 

B. Paper Organization 

The rest of the paper is organised as follows. In Section II, we introduce our notation and recall the problem
formulation of the Gray-Wyner network. In Section III, we introduce the tilted information density for the Gray-

1 Because of the regularity condition, our result cannot be applied to singular points on the boundary of the first-order region, i.e., the boundary 
points where the first-order region cannot be differentiated. 
Fig. 1. A description of the Gray-Wyner network.

Wyner network and investigate its properties. Then, in Section IV, we show our second-order coding theorem and
its proof. In Section V, we further investigate the Pangloss plane. We conclude the paper with some discussions in
Section VI.


II. Problem Formulation 

In this section, we introduce our notation and recall the Gray-Wyner network [6].

A. Notations 

Random variables (e.g., $X$) and their realizations (e.g., $x$) are denoted by capital and lower-case letters, respectively. All random variables take values in finite alphabets, which are denoted in calligraphic font (e.g., $\mathcal{X}$). The cardinality of $\mathcal{X}$ is denoted by $|\mathcal{X}|$. Let $X^n = (X_1,\ldots,X_n)$ be a random vector, and similarly let $\boldsymbol{x} = (x_1,\ldots,x_n)$ be a realization. For information-theoretic quantities, we follow the notation of [4]; e.g., the entropy and the mutual information are denoted by $H(X)$ and $I(X \wedge Y)$, respectively. Also, the expectation and the variance are denoted by $\mathsf{E}[\cdot]$ and $\mathsf{V}[\cdot]$, respectively. $Q(t) = \int_t^\infty \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du$ is the upper tail probability of the standard normal distribution; its inverse is denoted by $Q^{-1}(\varepsilon)$ for $0 < \varepsilon < 1$.
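For readers who want to evaluate the tail function numerically, the following is a minimal sketch using only the Python standard library. The function names `q_func` and `q_inv` are illustrative choices and do not appear in the paper.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def q_func(t: float) -> float:
    """Upper tail probability Q(t) = Pr(Z > t) for a standard normal Z."""
    return 1.0 - _STD_NORMAL.cdf(t)


def q_inv(eps: float) -> float:
    """Inverse of Q on the open interval (0, 1): returns t with Q(t) = eps."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return _STD_NORMAL.inv_cdf(1.0 - eps)
```

Note that `q_inv(eps)` is positive for `eps < 1/2` and negative for `eps > 1/2`, which is why second-order rates can take either sign.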

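The set of all distributions on $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{X})$. The set of all channels from $\mathcal{X}$ to $\mathcal{Y}$ is denoted by $\mathcal{P}(\mathcal{Y}|\mathcal{X})$. We will also use the method of types [4]. For a given sequence $\boldsymbol{x}$, its type is denoted by $t_{\boldsymbol{x}}$. The set of all types on $\mathcal{X}$ is denoted by $\mathcal{P}_n(\mathcal{X})$, and the set of all conditional types is denoted by $\mathcal{P}_n(\mathcal{Y}|\mathcal{X})$. For a given type $P_{\bar{X}} \in \mathcal{P}_n(\mathcal{X})$, the set of all sequences with type $P_{\bar{X}}$ is denoted by $\mathcal{T}^n_{\bar{X}}$. For a given joint type $P_{\bar{X}\bar{Y}}$ and a sequence $\boldsymbol{x} \in \mathcal{T}^n_{\bar{X}}$, the set of all sequences whose joint type with $\boldsymbol{x}$ is $P_{\bar{X}\bar{Y}}$ is denoted by $\mathcal{T}^n_{\bar{Y}|\bar{X}}(\boldsymbol{x})$. For a type $P_{\bar{X}}$ and a joint type $P_{\bar{X}\bar{Y}}$, we use the notations $H(\bar{X})$ and $I(\bar{X} \wedge \bar{Y})$, where the random variables are distributed according to that type and joint type.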
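As a concrete illustration of the joint type of a pair of sequences, here is a small sketch. The helper name `joint_type` is hypothetical; exact fractions are used so that the type visibly has denominators dividing $n$.

```python
from collections import Counter
from fractions import Fraction


def joint_type(xs, ys):
    """Return the joint type (empirical distribution) of two equal-length sequences.

    The result maps each observed pair (x, y) to its relative frequency k/n.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    counts = Counter(zip(xs, ys))
    return {pair: Fraction(count, n) for pair, count in counts.items()}
```

Two pairs of sequences lie in the same joint type class exactly when this function returns the same dictionary for both.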

For a given distribution $P_X$, its support is denoted by $\mathrm{supp}(P_X)$. In later sections, we will differentiate a certain function of distributions around a given joint distribution $P_{XY}$, which may not have full support. For that purpose, it is convenient to introduce a parametrization for distributions $P$ that have the same support as $P_{XY}$.² Let $m = |\mathrm{supp}(P_{XY})|$; without loss of generality, we assign $1$ through $m$ to the elements of $\mathrm{supp}(P_{XY})$. Then, the parameter $\theta(P) \in \mathbb{R}^{m-1}$ is defined by $\theta_i = P(i)$ for $i = 1,\ldots,m-1$; clearly, $P(m) = 1 - \sum_{i=1}^{m-1}\theta_i$. The distribution corresponding to the parameter $\theta$ is denoted by $P_\theta$.
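The parametrization can be sketched in a few lines. The names `to_theta` and `from_theta` are illustrative only, and the input is assumed to list the probabilities of the $m$ support points in the chosen order.

```python
def to_theta(p):
    """Map (P(1), ..., P(m)) on the support to theta = (P(1), ..., P(m-1))."""
    if len(p) == 0 or any(v <= 0 for v in p) or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must be a strictly positive probability vector")
    return list(p[:-1])


def from_theta(theta):
    """Recover P_theta: the last mass is 1 minus the sum of the parameters."""
    return list(theta) + [1.0 - sum(theta)]
```

Because the last coordinate is determined by the others, partial derivatives with respect to $\theta_i$ stay inside the simplex: increasing $\theta_i$ moves mass from point $m$ to point $i$.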

B. Gray-Wyner Network 

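In this section, we recall the lossless source coding problem over the Gray-Wyner network (see Fig. 1). Let us consider a correlated source $(X, Y)$ taking values in $\mathcal{X} \times \mathcal{Y}$ and having joint distribution $P_{XY}$. We consider block coding of length $n$. A coding system consists of three encoders

\begin{align}
\varphi_0^{(n)} &: \mathcal{X}^n \times \mathcal{Y}^n \to \mathcal{M}_0^{(n)}, \tag{1} \\
\varphi_1^{(n)} &: \mathcal{X}^n \times \mathcal{Y}^n \to \mathcal{M}_1^{(n)}, \tag{2} \\
\varphi_2^{(n)} &: \mathcal{X}^n \times \mathcal{Y}^n \to \mathcal{M}_2^{(n)}, \tag{3}
\end{align}

and two decoders

\begin{align}
\psi_1^{(n)} &: \mathcal{M}_0^{(n)} \times \mathcal{M}_1^{(n)} \to \mathcal{X}^n, \tag{4} \\
\psi_2^{(n)} &: \mathcal{M}_0^{(n)} \times \mathcal{M}_2^{(n)} \to \mathcal{Y}^n. \tag{5}
\end{align}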


The message encoded by $\varphi_0^{(n)}$ is sent over the common channel and received by both decoders; the message encoded by $\varphi_i^{(n)}$ is sent over the private channel to the $i$th decoder, where $i = 1,2$. The first decoder is required to reproduce $X^n$ almost losslessly, while the second decoder is required to reproduce $Y^n$ almost losslessly. In the following, we omit the blocklength $n$ when it is obvious from the context. For $(X^n, Y^n) \sim P$, the error probability of the code $\Phi_n = (\varphi_0, \varphi_1, \varphi_2, \psi_1, \psi_2)$ is defined as

$$P_e(\Phi_n|P) := \Pr\Big( \big(\psi_1(\varphi_0(X^n,Y^n), \varphi_1(X^n,Y^n)),\, \psi_2(\varphi_0(X^n,Y^n), \varphi_2(X^n,Y^n))\big) \neq (X^n, Y^n) \Big). \tag{6}$$

Then, the correct probability of the code is defined as

$$P_c(\Phi_n|P) := 1 - P_e(\Phi_n|P). \tag{7}$$

In the following, we are particularly interested in the case where $P$ is a product distribution $P_{XY}^n$, i.e., $(X^n, Y^n)$ is an i.i.d. sequence.

Definition 1 (First-Order Region) The rate triplet $(r_0, r_1, r_2) \in \mathbb{R}_+^3$ is defined to be achievable if there exists a

2 In the literature 03 (see also 1221 ), the probability simplex is embedded into the Euclidian space, and the parameterization on that Euclidian 
space is used. However, in this paper, we regard the probability simplex as a manifold (cf. [1]), and we consider a parameterization that is
different from the literature so that we do not have to extend the domain of a certain function to outside the probability simplex. 
sequence of codes $\{\Phi_n\}_{n=1}^\infty$ such that

\begin{align}
\limsup_{n\to\infty} \frac{1}{n} \log |\mathcal{M}_0^{(n)}| &\le r_0, \tag{8} \\
\limsup_{n\to\infty} \frac{1}{n} \log |\mathcal{M}_1^{(n)}| &\le r_1, \tag{9} \\
\limsup_{n\to\infty} \frac{1}{n} \log |\mathcal{M}_2^{(n)}| &\le r_2, \tag{10}
\end{align}

and

$$\lim_{n\to\infty} P_e(\Phi_n|P_{XY}^n) = 0. \tag{11}$$

Then, the achievable region $\mathcal{R}_{\mathrm{GW}}(P_{XY})$ is defined as the set of all achievable rate triplets.

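The first-order region $\mathcal{R}_{\mathrm{GW}}(P_{XY})$ is characterized in [6]. Let $\mathcal{R}^*_{\mathrm{GW}}(P_{XY})$ be the set of all rate triplets $(r_0, r_1, r_2)$ such that there exists a test channel $P_{W|XY}$ with $|\mathcal{W}| \le |\mathcal{X}||\mathcal{Y}| + 2$ such that

\begin{align}
r_0 &\ge I(W \wedge X, Y), \tag{12} \\
r_1 &\ge H(X|W), \tag{13} \\
r_2 &\ge H(Y|W). \tag{14}
\end{align}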
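To make the three constraints (12)-(14) concrete, the following sketch evaluates $I(W \wedge X,Y)$, $H(X|W)$ and $H(Y|W)$ in bits for a given source and test channel. The function name and the dictionary-based data layout are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import defaultdict


def gray_wyner_rates(p_xy, p_w_given_xy):
    """Return (I(W;X,Y), H(X|W), H(Y|W)) in bits.

    p_xy:          dict mapping (x, y) -> probability
    p_w_given_xy:  dict mapping (x, y) -> dict mapping w -> probability
    """
    p_wxy = {}
    p_w = defaultdict(float)
    p_wx = defaultdict(float)
    p_wy = defaultdict(float)
    for (x, y), pxy in p_xy.items():
        for w, pw in p_w_given_xy[(x, y)].items():
            joint = pxy * pw
            if joint == 0.0:
                continue  # zero-probability triples contribute nothing
            p_wxy[(w, x, y)] = joint
            p_w[w] += joint
            p_wx[(w, x)] += joint
            p_wy[(w, y)] += joint
    mutual_info = sum(
        p * math.log2(p / (p_w[w] * p_xy[(x, y)])) for (w, x, y), p in p_wxy.items()
    )
    h_x_given_w = -sum(p * math.log2(p / p_w[w]) for (w, _), p in p_wx.items())
    h_y_given_w = -sum(p * math.log2(p / p_w[w]) for (w, _), p in p_wy.items())
    return mutual_info, h_x_given_w, h_y_given_w
```

For example, when $X = Y$ is a uniform bit, choosing $W = X$ puts the whole bit on the common channel, while a constant $W$ puts one bit on each private channel.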


Proposition 1 ([6]) It holds that³

$$\mathcal{R}_{\mathrm{GW}}(P_{XY}) = \mathcal{R}^*_{\mathrm{GW}}(P_{XY}). \tag{15}$$


In this paper, we are interested in the second-order region. We follow the second-order formulation in [24].


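Definition 2 (Second-Order Region) For a boundary point $(r_0^*, r_1^*, r_2^*)$ of $\mathcal{R}_{\mathrm{GW}}(P_{XY})$ and $0 < \varepsilon < 1$, the rate triplet $(L_0, L_1, L_2) \in \mathbb{R}^3$ is defined to be $(\varepsilon, r_0^*, r_1^*, r_2^*)$-achievable if there exists a sequence of codes $\{\Phi_n\}_{n=1}^\infty$ such that

\begin{align}
\limsup_{n\to\infty} \frac{\log |\mathcal{M}_0^{(n)}| - n r_0^*}{\sqrt{n}} &\le L_0, \tag{16} \\
\limsup_{n\to\infty} \frac{\log |\mathcal{M}_1^{(n)}| - n r_1^*}{\sqrt{n}} &\le L_1, \tag{17} \\
\limsup_{n\to\infty} \frac{\log |\mathcal{M}_2^{(n)}| - n r_2^*}{\sqrt{n}} &\le L_2, \tag{18}
\end{align}

and

$$\limsup_{n\to\infty} P_e(\Phi_n|P_{XY}^n) \le \varepsilon. \tag{19}$$

Then, the $(\varepsilon, r_0^*, r_1^*, r_2^*)$-achievable region $\mathcal{L}_{\mathrm{GW}}(\varepsilon; r_0^*, r_1^*, r_2^*)$ is defined as the set of all $(\varepsilon, r_0^*, r_1^*, r_2^*)$-achievable rate triplets.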


3 In fact, the cardinality bound was not shown in (6), but it can be proved by the support lemma (4) (see also 03)- 
In contrast to first-order rates, second-order rates may be negative even though they are conventionally called
“rates”.
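One way to read Definition 2 is through the following expansion, offered here as an informal illustration rather than as part of the definition. A sequence of codes whose message set sizes behave as

```latex
\log |\mathcal{M}_i^{(n)}| = n r_i^* + \sqrt{n}\, L_i + o(\sqrt{n}), \qquad i = 0, 1, 2,
```

satisfies the three size constraints of Definition 2, since dividing $\log |\mathcal{M}_i^{(n)}| - n r_i^*$ by $\sqrt{n}$ leaves $L_i + o(1)$. A negative $L_i$ therefore means that the code uses slightly fewer than $n r_i^*$ bits on channel $i$, which is possible when the tolerated error probability is large enough or when the other channels compensate.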


III. Tilted Information Density 

In this section, we introduce the tilted information density for the Gray-Wyner network in the spirit of [16]. The
tilted information density plays an important role in characterizing the second-order region $\mathcal{L}_{\mathrm{GW}}(\varepsilon; r_0^*, r_1^*, r_2^*)$ in the
next section.

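Given $r_1, r_2 \ge 0$, let

\begin{align}
R(r_1, r_2|P_{XY}) &:= \min\{ r_0 : (r_0, r_1, r_2) \in \mathcal{R}^*_{\mathrm{GW}}(P_{XY}) \} \tag{20} \\
&= \min\{ I(W \wedge X, Y) : |\mathcal{W}| \le |\mathcal{X}||\mathcal{Y}| + 2,\ r_1 \ge H(X|W),\ r_2 \ge H(Y|W) \}. \tag{21}
\end{align}

Since $\mathcal{R}_{\mathrm{GW}}(P_{XY})$ is a convex region, an optimal test channel satisfies the conditions $r_1 \ge H(X|W)$ and $r_2 \ge H(Y|W)$ with equality unless $R(r_1, r_2|P_{XY}) = 0$.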

Throughout the paper, we assume that $\mathcal{R}_{\mathrm{GW}}(P_{XY})$ is smooth at a boundary point $(r_0^*, r_1^*, r_2^*)$ of our interest,⁴ i.e.,

$$\lambda_i^* = \lambda_i^*(P_{XY}) := -\frac{\partial}{\partial r_i} R(r_1, r_2|P_{XY}) \bigg|_{\boldsymbol{r} = \boldsymbol{r}^*} \tag{22}$$

is well defined for $i = 1,2$, where $\boldsymbol{r}^* = (r_1^*, r_2^*)$. Note that $\lambda_i^* \ge 0$. In the following, we assume that they are strictly positive. In other words, we consider a boundary point such that $r_0^* > 0$.

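For given $P_{W|XY} \in \mathcal{P}(\mathcal{W}|\mathcal{X} \times \mathcal{Y})$, $\tilde{P}_W \in \mathcal{P}(\mathcal{W})$, $\tilde{P}_{X|W} \in \mathcal{P}(\mathcal{X}|\mathcal{W})$, and $\tilde{P}_{Y|W} \in \mathcal{P}(\mathcal{Y}|\mathcal{W})$, we introduce the following function:

\begin{align}
& F(P_{W|XY}, \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W}) \tag{23} \\
&:= D(P_{W|XY} \| \tilde{P}_W | P_{XY}) + \lambda_1^* \mathsf{E}\left[ \log \frac{1}{\tilde{P}_{X|W}(X|W)} - r_1^* \right] + \lambda_2^* \mathsf{E}\left[ \log \frac{1}{\tilde{P}_{Y|W}(Y|W)} - r_2^* \right] \tag{24} \\
&= I(W \wedge X, Y) + D(P_W \| \tilde{P}_W) + \lambda_1^* \{ H(X|W) + D(P_{X|W} \| \tilde{P}_{X|W} | P_W) - r_1^* \} \tag{25} \\
&\quad + \lambda_2^* \{ H(Y|W) + D(P_{Y|W} \| \tilde{P}_{Y|W} | P_W) - r_2^* \}. \tag{26}
\end{align}

From the second expression, we can find that the following holds:

$$R(r_1^*, r_2^*|P_{XY}) = \min_{\tilde{P}_W} \min_{\tilde{P}_{X|W}} \min_{\tilde{P}_{Y|W}} \min_{P_{W|XY}} F(P_{W|XY}, \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W}). \tag{27}$$

For given $\tilde{P}_W$, $\tilde{P}_{X|W}$, $\tilde{P}_{Y|W}$, $\lambda_1 > 0$, and $\lambda_2 > 0$, let

\begin{align}
& \Lambda(x, y | \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W}, \lambda_1, \lambda_2) \tag{28} \\
&:= -\log \mathsf{E}\left[ \exp\left\{ \lambda_1 \left( r_1^* - \log \frac{1}{\tilde{P}_{X|W}(x|W)} \right) + \lambda_2 \left( r_2^* - \log \frac{1}{\tilde{P}_{Y|W}(y|W)} \right) \right\} \right], \tag{29}
\end{align}

where each term $\exp\{\cdots\}$ in the expectation is understood as $0$ if either $\tilde{P}_{X|W}(x|w) = 0$ or $\tilde{P}_{Y|W}(y|w) = 0$, and the expectation is taken with respect to $W \sim \tilde{P}_W$.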
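The function $\Lambda$ is a log-partition-type quantity, and it can be evaluated directly for finite alphabets. The sketch below works in nats; the names `tilted_lambda` and `induced_channel` and the dictionary layout are hypothetical. The second function builds the exponentially tilted channel $w \mapsto \tilde{P}_W(w)\exp\{\Lambda + \cdots\}$ obtained by normalizing the terms inside the expectation, which is a probability distribution over $w$ by construction.

```python
import math


def _exponent(x, y, w, p_x_w, p_y_w, lam1, lam2, r1, r2):
    """Return the exponent for symbol w, or None if a conditional probability is 0."""
    px = p_x_w[w].get(x, 0.0)
    py = p_y_w[w].get(y, 0.0)
    if px == 0.0 or py == 0.0:
        return None  # the exp-term is understood as 0 in this case
    # r - log(1/p) equals r + log(p)
    return lam1 * (r1 + math.log(px)) + lam2 * (r2 + math.log(py))


def tilted_lambda(x, y, p_w, p_x_w, p_y_w, lam1, lam2, r1, r2):
    """Lambda(x, y | P_W, P_X|W, P_Y|W, lam1, lam2) = -log E_W[exp{...}] in nats."""
    total = 0.0
    for w, pw in p_w.items():
        e = _exponent(x, y, w, p_x_w, p_y_w, lam1, lam2, r1, r2)
        if e is not None:
            total += pw * math.exp(e)
    if total == 0.0:
        raise ValueError("every term vanishes: Lambda is not finite for this (x, y)")
    return -math.log(total)


def induced_channel(x, y, p_w, p_x_w, p_y_w, lam1, lam2, r1, r2):
    """Tilted channel w -> P_W(w) * exp{Lambda + exponent}, zero where a term vanishes."""
    lam = tilted_lambda(x, y, p_w, p_x_w, p_y_w, lam1, lam2, r1, r2)
    channel = {}
    for w, pw in p_w.items():
        e = _exponent(x, y, w, p_x_w, p_y_w, lam1, lam2, r1, r2)
        channel[w] = 0.0 if e is None else pw * math.exp(lam + e)
    return channel
```

The numerical values of the rates and multipliers passed to these functions are up to the caller; nothing here computes the optimal test channel or the slopes $\lambda_i^*$.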


4 The region $\mathcal{R}_{\mathrm{GW}}(P_{XY})$ has some singular points in general, and the following analysis does not apply to those singular points.
The following lemma gives a connection between the two functions $F(\cdots)$ and $\Lambda(\cdots)$.


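Lemma 1 For any $\tilde{P}_W$, $\tilde{P}_{X|W}$, and $\tilde{P}_{Y|W}$,⁵ we have

$$\min_{P_{W|XY}} F(P_{W|XY}, \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W}) = \mathsf{E}[\Lambda(X, Y | \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W}, \lambda_1^*, \lambda_2^*)], \tag{30}$$

where the minimization is uniquely achieved by $P_{W|XY}$ such that

\begin{align}
& P_{W|XY}(w|x,y) \tag{31} \\
&= \tilde{P}_W(w) \exp\left\{ \Lambda(x, y | \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W}, \lambda_1^*, \lambda_2^*) + \lambda_1^* \left( r_1^* - \log \frac{1}{\tilde{P}_{X|W}(x|w)} \right) + \lambda_2^* \left( r_2^* - \log \frac{1}{\tilde{P}_{Y|W}(y|w)} \right) \right\} \tag{32}
\end{align}

and $P_{W|XY}(w|x,y) = 0$ whenever either $\tilde{P}_{X|W}(x|w) = 0$ or $\tilde{P}_{Y|W}(y|w) = 0$.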


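Proof: Without loss of optimality, we can assume that $P_{W|XY}(w|x,y) = 0$ whenever either $\tilde{P}_{X|W}(x|w) = 0$ or $\tilde{P}_{Y|W}(y|w) = 0$. Otherwise, the value of $F(P_{W|XY}, \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W})$ is infinite. By using the log-sum inequality (cf. SI), we have

\begin{align}
& F(P_{W|XY}, \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W}) \tag{33} \\
&= \sum_{x,y,w} P_{XY}(x,y) P_{W|XY}(w|x,y) \log \frac{P_{W|XY}(w|x,y)}{\tilde{P}_W(w)} \tag{34} \\
&\quad + \lambda_1^* \sum_{x,y,w} P_{XY}(x,y) P_{W|XY}(w|x,y) \left( \log \frac{1}{\tilde{P}_{X|W}(x|w)} - r_1^* \right) \tag{35} \\
&\quad + \lambda_2^* \sum_{x,y,w} P_{XY}(x,y) P_{W|XY}(w|x,y) \left( \log \frac{1}{\tilde{P}_{Y|W}(y|w)} - r_2^* \right) \tag{36} \\
&= \sum_{x,y,w} P_{XY}(x,y) P_{W|XY}(w|x,y) \log \frac{P_{W|XY}(w|x,y)}{\tilde{P}_W(w) \exp\left\{ \lambda_1^* \left( r_1^* - \log \frac{1}{\tilde{P}_{X|W}(x|w)} \right) + \lambda_2^* \left( r_2^* - \log \frac{1}{\tilde{P}_{Y|W}(y|w)} \right) \right\}} \tag{37} \\
&\ge \mathsf{E}[\Lambda(X, Y | \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W}, \lambda_1^*, \lambda_2^*)], \tag{38}
\end{align}

where the equality holds if and only if $P_{W|XY}$ is given by (32). ■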

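Let $P^*_{W|XY}$ be an optimal test channel that achieves $R(r_1^*, r_2^*|P_{XY})$, and let $P_{W^*}$, $P_{X|W^*}$, and $P_{Y|W^*}$ be the corresponding output distribution and conditional distributions, respectively. Then, note that

\begin{align}
R(r_1^*, r_2^*|P_{XY}) &= \min_{\tilde{P}_W} \min_{\tilde{P}_{X|W}} \min_{\tilde{P}_{Y|W}} \min_{P_{W|XY}} F(P_{W|XY}, \tilde{P}_W, \tilde{P}_{X|W}, \tilde{P}_{Y|W}) \tag{39} \\
&\le \min_{P_{W|XY}} F(P_{W|XY}, P_{W^*}, P_{X|W^*}, P_{Y|W^*}) \tag{40} \\
&\le F(P^*_{W|XY}, P_{W^*}, P_{X|W^*}, P_{Y|W^*}) \tag{41} \\
&= R(r_1^*, r_2^*|P_{XY}). \tag{42}
\end{align}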


5 The only exceptional case is where either $\tilde{P}_{X|W}(x|w) = 0$ or $\tilde{P}_{Y|W}(y|w) = 0$ for every $w \in \mathrm{supp}(\tilde{P}_W)$. We will not invoke this lemma
for such an exceptional case throughout the paper.
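This implies that $P^*_{W|XY}$ achieves the minimization in (40). Thus, by Lemma 1, $P^*_{W|XY}$ must satisfy

\begin{align}
& P^*_{W|XY}(w|x,y) \tag{43} \\
&= P_{W^*}(w) \exp\left\{ \Lambda(x, y | P_{W^*}, P_{X|W^*}, P_{Y|W^*}, \lambda_1^*, \lambda_2^*) + \lambda_1^* \left( r_1^* - \log \frac{1}{P_{X|W^*}(x|w)} \right) + \lambda_2^* \left( r_2^* - \log \frac{1}{P_{Y|W^*}(y|w)} \right) \right\}, \tag{44}
\end{align}

which is equivalent to

\begin{align}
& \Lambda(x, y | P_{W^*}, P_{X|W^*}, P_{Y|W^*}, \lambda_1^*, \lambda_2^*) \tag{45} \\
&= \log \frac{P^*_{W|XY}(w|x,y)}{P_{W^*}(w)} + \lambda_1^* \left( \log \frac{1}{P_{X|W^*}(x|w)} - r_1^* \right) + \lambda_2^* \left( \log \frac{1}{P_{Y|W^*}(y|w)} - r_2^* \right) \tag{46}
\end{align}

for $w \in \mathrm{supp}(P^*_{W|XY}(\cdot|x,y))$.

Although optimal test channels may not be unique,⁶ we have the following important property.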


Lemma 2 Let $P^*_{W_1|XY}$ and $P^*_{W_2|XY}$ be optimal test channels, and let $P_{W_i^*}$, $P_{X|W_i^*}$, and $P_{Y|W_i^*}$ for $i = 1,2$ be the corresponding output distributions and conditional distributions. Then, we have

$$\Lambda(x, y | P_{W_1^*}, P_{X|W_1^*}, P_{Y|W_1^*}, \lambda_1^*, \lambda_2^*) = \Lambda(x, y | P_{W_2^*}, P_{X|W_2^*}, P_{Y|W_2^*}, \lambda_1^*, \lambda_2^*). \tag{47}$$


Proof: Let $\mathcal{T} = \{1,2\}$ be a time-sharing alphabet, and let $\tilde{R}(r_1, r_2|P_{XY})$ be defined in the same manner as $R(r_1, r_2|P_{XY})$ where the auxiliary alphabet is extended from $\mathcal{W}$ to $\mathcal{T} \times \mathcal{W}$. In fact, by the support lemma (cf. PI), this extension does not change the value, i.e.,

$$\tilde{R}(r_1, r_2|P_{XY}) = R(r_1, r_2|P_{XY}). \tag{48}$$

We also note that, for $\alpha_1, \alpha_2 > 0$ with $\alpha_1 + \alpha_2 = 1$, a test channel $P^*_{TW|XY}$ defined by

$$P^*_{TW|XY}(t, w|x,y) := \alpha_t P^*_{W_t|XY}(w|x,y) \tag{49}$$

is an optimal test channel for $\tilde{R}(r_1^*, r_2^*|P_{XY})$. Thus, we can apply the same argument that leads to (46), and we conclude that the value of

$$\log \frac{P^*_{TW|XY}(t,w|x,y)}{P_{T^*W^*}(t,w)} + \lambda_1^* \left( \log \frac{1}{P_{X|T^*W^*}(x|t,w)} - r_1^* \right) + \lambda_2^* \left( \log \frac{1}{P_{Y|T^*W^*}(y|t,w)} - r_2^* \right) \tag{50}$$

does not depend on $(t,w) \in \mathrm{supp}(P^*_{TW|XY}(\cdot|x,y))$, where $P_{T^*W^*}$, $P_{X|T^*W^*}$, and $P_{Y|T^*W^*}$ are the output distribution and conditional distributions induced by $P^*_{TW|XY}$. This together with (49) implies that,⁷ for any $w \in \mathrm{supp}(P^*_{W_1|XY}(\cdot|x,y))$ and $w' \in \mathrm{supp}(P^*_{W_2|XY}(\cdot|x,y))$,

\begin{align}
& \log \frac{P^*_{W_1|XY}(w|x,y)}{P_{W_1^*}(w)} + \lambda_1^* \left( \log \frac{1}{P_{X|W_1^*}(x|w)} - r_1^* \right) + \lambda_2^* \left( \log \frac{1}{P_{Y|W_1^*}(y|w)} - r_2^* \right) \tag{51} \\
&= \log \frac{P^*_{W_2|XY}(w'|x,y)}{P_{W_2^*}(w')} + \lambda_1^* \left( \log \frac{1}{P_{X|W_2^*}(x|w')} - r_1^* \right) + \lambda_2^* \left( \log \frac{1}{P_{Y|W_2^*}(y|w')} - r_2^* \right). \tag{52}
\end{align}

Thus, we have (47). ■

6 In fact, since the auxiliary alphabet does not have any semantic meaning, for a given optimal test channel, we can always produce another
optimal test channel by permuting symbols in $\mathcal{W}$.

7 Also note that $\mathrm{supp}(P^*_{TW|XY}(\cdot|x,y)) = \{1\} \times \mathrm{supp}(P^*_{W_1|XY}(\cdot|x,y)) \cup \{2\} \times \mathrm{supp}(P^*_{W_2|XY}(\cdot|x,y))$.

Because of Lemma 2, the following tilted information density is well-defined irrespective of the choice of optimal
test channels.⁸

Definition 3 Let $P^*_{W|XY}$ be an optimal test channel, and let $P_{W^*}$, $P_{X|W^*}$, and $P_{Y|W^*}$ be the corresponding output
distribution and conditional distributions, respectively. Then, the tilted information density for the Gray-Wyner network
(with respect to the $r_0$-axis) is defined by

$$\jmath_{XY}(x,y) = \jmath_{XY}(x,y|r_1^*, r_2^*) := \Lambda(x, y | P_{W^*}, P_{X|W^*}, P_{Y|W^*}, \lambda_1^*, \lambda_2^*). \tag{53}$$

From (46), we have

$$\jmath_{XY}(x,y) = \log \frac{P^*_{W|XY}(w|x,y)}{P_{W^*}(w)} + \lambda_1^* \left( \log \frac{1}{P_{X|W^*}(x|w)} - r_1^* \right) + \lambda_2^* \left( \log \frac{1}{P_{Y|W^*}(y|w)} - r_2^* \right) \tag{54}$$

for every $w \in \mathrm{supp}(P^*_{W|XY}(\cdot|x,y))$,⁹ and thus

$$\mathsf{E}[\jmath_{XY}(X,Y)] = R(r_1^*, r_2^*|P_{XY}). \tag{55}$$


Now we differentiate $R(r_1^*, r_2^*|P_\theta)$ with respect to $\theta$ around $\xi = \theta(P_{XY})$ (cf. Section II-A for the parametric notation). For a technical reason, we assume that there exists a set of optimal test channels $\{P^*_{W|X_\theta Y_\theta}\}_{\theta \in \mathcal{N}(\xi)}$ in a neighbourhood $\mathcal{N}(\xi)$ of $\xi$ such that $P^*_{W|X_\theta Y_\theta}$ is differentiable.¹⁰ Also, we take $\mathcal{N}(\xi)$ sufficiently small so that

$$\mathrm{supp}(P^*_{W|X_\xi Y_\xi}(\cdot|x,y)) \subseteq \mathrm{supp}(P^*_{W|X_\theta Y_\theta}(\cdot|x,y)) \tag{56}$$

for every $x, y$. The following lemma can be proved in a similar manner as [15, Theorem 2.2].


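Lemma 3 Let $\xi = \theta(P_{XY})$. Then, we have

$$\frac{\partial R(r_1^*, r_2^*|P_\theta)}{\partial \theta_i} \bigg|_{\theta = \xi} = \jmath_{XY}(i) - \jmath_{XY}(m). \tag{57}$$

Proof: Let $(X_\theta, Y_\theta) \sim P_\theta$. Since we can write

$$R(r_1^*, r_2^*|P_\theta) = \sum_{x,y} P_\theta(x,y)\, \jmath_{X_\theta Y_\theta}(x,y), \tag{58}$$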


8 Because of [2, Theorem 2.4.2], the d-tilted information in the rate-distortion problem is also defined irrespective of the choice of optimal
test channels; the minimum and the maximum in [16, Remark 9] are superfluous. See also [3, Remark in p. 69].

9 In contrast to the property of d-tilted information in [16, Eq. (17)], (54) only holds for $w \in \mathrm{supp}(P^*_{W|XY}(\cdot|x,y))$ instead of $w \in \mathrm{supp}(P_{W^*})$. This stems from the fact that either $\log \frac{1}{P_{X|W^*}(x|w)}$ or $\log \frac{1}{P_{Y|W^*}(y|w)}$ may be infinite, while distortion measures that may
take infinity are excluded in [16].

10 In fact, this regularity condition can be replaced by any regularity condition that guarantees the validity of (70).
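we have

\begin{align}
& \frac{\partial R(r_1^*, r_2^*|P_\theta)}{\partial \theta_i} \bigg|_{\theta=\xi} \tag{59} \\
&= \sum_{x,y} \frac{\partial P_\theta(x,y)}{\partial \theta_i} \bigg|_{\theta=\xi} \jmath_{X_\xi Y_\xi}(x,y) + \sum_{x,y} P_\xi(x,y) \frac{\partial \jmath_{X_\theta Y_\theta}(x,y)}{\partial \theta_i} \bigg|_{\theta=\xi} \tag{60} \\
&= \jmath_{X_\xi Y_\xi}(i) - \jmath_{X_\xi Y_\xi}(m) + \sum_{x,y} P_\xi(x,y) \frac{\partial \jmath_{X_\theta Y_\theta}(x,y)}{\partial \theta_i} \bigg|_{\theta=\xi}. \tag{61}
\end{align}

We now evaluate the second term as follows. By (54) and the assumption (56), we have

\begin{align}
& \mathsf{E}[\jmath_{X_\theta Y_\theta}(X_\xi, Y_\xi)] = \sum_{w,x,y} P_\xi(x,y) P^*_{W|X_\xi Y_\xi}(w|x,y)\, \jmath_{X_\theta Y_\theta}(x,y) \tag{62} \\
&= \sum_{w,x,y} P_\xi(x,y) P^*_{W|X_\xi Y_\xi}(w|x,y) \left[ \log \frac{P^*_{W|X_\theta Y_\theta}(w|x,y)}{P_{W_\theta^*}(w)} + \lambda_{1,\theta} \left( \log \frac{1}{P_{X_\theta|W_\theta^*}(x|w)} - r_1^* \right) + \lambda_{2,\theta} \left( \log \frac{1}{P_{Y_\theta|W_\theta^*}(y|w)} - r_2^* \right) \right], \tag{63}
\end{align}

where $\lambda_{i,\theta}$ is defined by (22) for $P_\theta$. Thus, we have¹¹

\begin{align}
& \sum_{x,y} P_\xi(x,y) \frac{\partial \jmath_{X_\theta Y_\theta}(x,y)}{\partial \theta_i} \bigg|_{\theta=\xi} \tag{64} \\
&= \frac{\partial \mathsf{E}[\jmath_{X_\theta Y_\theta}(X_\xi, Y_\xi)]}{\partial \theta_i} \bigg|_{\theta=\xi} \tag{65} \\
&= \frac{\partial \mathsf{E}[\log P^*_{W|X_\theta Y_\theta}(W_\xi^*|X_\xi, Y_\xi)]}{\partial \theta_i} \bigg|_{\theta=\xi} - \frac{\partial \mathsf{E}[\log P_{W_\theta^*}(W_\xi^*)]}{\partial \theta_i} \bigg|_{\theta=\xi} + \frac{\partial \lambda_{1,\theta}}{\partial \theta_i} \bigg|_{\theta=\xi} \{ H(X_\xi|W_\xi^*) - r_1^* \} - \lambda_1^* \frac{\partial \mathsf{E}[\log P_{X_\theta|W_\theta^*}(X_\xi|W_\xi^*)]}{\partial \theta_i} \bigg|_{\theta=\xi} \tag{66} \\
&\quad + \frac{\partial \lambda_{2,\theta}}{\partial \theta_i} \bigg|_{\theta=\xi} \{ H(Y_\xi|W_\xi^*) - r_2^* \} - \lambda_2^* \frac{\partial \mathsf{E}[\log P_{Y_\theta|W_\theta^*}(Y_\xi|W_\xi^*)]}{\partial \theta_i} \bigg|_{\theta=\xi} \tag{67} \\
&= \frac{\partial}{\partial \theta_i} \mathsf{E}\left[ \frac{P^*_{W|X_\theta Y_\theta}(W_\xi^*|X_\xi, Y_\xi)}{P^*_{W|X_\xi Y_\xi}(W_\xi^*|X_\xi, Y_\xi)} \right] \bigg|_{\theta=\xi} - \frac{\partial}{\partial \theta_i} \mathsf{E}\left[ \frac{P_{W_\theta^*}(W_\xi^*)}{P_{W_\xi^*}(W_\xi^*)} \right] \bigg|_{\theta=\xi} \tag{68} \\
&\quad - \lambda_1^* \frac{\partial}{\partial \theta_i} \mathsf{E}\left[ \frac{P_{X_\theta|W_\theta^*}(X_\xi|W_\xi^*)}{P_{X_\xi|W_\xi^*}(X_\xi|W_\xi^*)} \right] \bigg|_{\theta=\xi} - \lambda_2^* \frac{\partial}{\partial \theta_i} \mathsf{E}\left[ \frac{P_{Y_\theta|W_\theta^*}(Y_\xi|W_\xi^*)}{P_{Y_\xi|W_\xi^*}(Y_\xi|W_\xi^*)} \right] \bigg|_{\theta=\xi} \tag{69} \\
&= 0, \tag{70}
\end{align}

where the third equality follows from $H(X_\xi|W_\xi^*) = r_1^*$ and $H(Y_\xi|W_\xi^*) = r_2^*$. ■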

IV. Coding Theorem 

In this section, we characterize the second-order region of the Gray-Wyner network. We first describe the
statement, and then it will be proved in Sections IV-A and IV-B.

Theorem 1 For a given boundary point $(r_0^*, r_1^*, r_2^*) \in \mathcal{R}_{\mathrm{GW}}(P_{XY})$, suppose that the function $R(r_1, r_2|P_\theta)$ defined by (20) is twice differentiable with respect to $(r_1, r_2, \theta)$ around $(r_1^*, r_2^*, \theta(P_{XY}))$ and that those second derivatives are bounded. Also, suppose that the regularity condition for Lemma 3 is satisfied. Then, we have

$$\mathcal{L}_{\mathrm{GW}}(\varepsilon; r_0^*, r_1^*, r_2^*) = \left\{ (L_0, L_1, L_2) : L_0 + \lambda_1^* L_1 + \lambda_2^* L_2 \ge \sqrt{V_{XY}}\, Q^{-1}(\varepsilon) \right\} \tag{71}$$

for $0 < \varepsilon < 1$, where $\lambda_i^*$ is given by (22) and

$$V_{XY} := \mathsf{V}[\jmath_{XY}(X,Y)]. \tag{72}$$

11 In the following calculation, the base of logarithm is e instead of 2, which is irrelevant to the final answer.
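Once the tilted information density and the slopes are known, checking whether a triplet lies in the region of Theorem 1 is a one-line computation. The sketch below assumes that the values $\jmath_{XY}(x,y)$ and the multipliers $\lambda_1^*, \lambda_2^*$ have already been obtained by some other means; the function names and the example values in the usage note are hypothetical and are not derived from any particular source in the paper.

```python
import math
from statistics import NormalDist


def dispersion(p_xy, j_xy):
    """Variance of the tilted information density j_xy under the source p_xy."""
    mean = sum(p * j_xy[key] for key, p in p_xy.items())
    return sum(p * (j_xy[key] - mean) ** 2 for key, p in p_xy.items())


def in_second_order_region(triplet, lam1, lam2, p_xy, j_xy, eps):
    """Check L0 + lam1*L1 + lam2*L2 >= sqrt(V) * Q^{-1}(eps)."""
    l0, l1, l2 = triplet
    q_inverse = NormalDist().inv_cdf(1.0 - eps)  # Q^{-1}(eps)
    threshold = math.sqrt(dispersion(p_xy, j_xy)) * q_inverse
    return l0 + lam1 * l1 + lam2 * l2 >= threshold
```

With a toy density taking the values 0 and 2 with equal probability, the variance is 1, so for $\varepsilon = 0.1$ the threshold is $Q^{-1}(0.1) \approx 1.28$: a weighted sum of 1.3 is inside the region, while 1.2 is not.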


A. Proof of Achievability 

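In this section, we prove the achievability part of Theorem 1. For each type $P_{\bar{X}\bar{Y}} \in \mathcal{P}_n(\mathcal{X} \times \mathcal{Y})$, we pick a conditional type $P_{\bar{W}|\bar{X}\bar{Y}} \in \mathcal{P}_n(\mathcal{W}|\mathcal{X} \times \mathcal{Y})$, and then construct a code $\mathcal{C}_n \subseteq \mathcal{T}^n_{\bar{W}}$ such that, for every $(\boldsymbol{x}, \boldsymbol{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}$, there exists $\boldsymbol{w} \in \mathcal{C}_n$ satisfying $(\boldsymbol{w}, \boldsymbol{x}, \boldsymbol{y}) \in \mathcal{T}^n_{\bar{W}\bar{X}\bar{Y}}$. The basic strategy is the same as the covering lemma in rate-distortion theory (cf. [40] and [4, Chapter 9]).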


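Lemma 4 Suppose that $n \ge n_0(|\mathcal{X}|, |\mathcal{Y}|, |\mathcal{W}|)$. Given a type $P_{\bar{X}\bar{Y}} \in \mathcal{P}_n(\mathcal{X} \times \mathcal{Y})$ and any test channel $P_{\tilde{W}|\bar{X}\bar{Y}}$ (not necessarily a conditional type), there exists a conditional type $P_{\bar{W}|\bar{X}\bar{Y}}$ satisfying

$$\left| P_{\bar{W}|\bar{X}\bar{Y}}(w|x,y) - P_{\tilde{W}|\bar{X}\bar{Y}}(w|x,y) \right| \le \frac{1}{n P_{\bar{X}\bar{Y}}(x,y)} \tag{73}$$

for every $(x,y) \in \mathrm{supp}(P_{\bar{X}\bar{Y}})$ and $w \in \mathrm{supp}(P_{\tilde{W}|\bar{X}\bar{Y}}(\cdot|x,y))$, and a subset $\mathcal{C}_n \subseteq \mathcal{T}^n_{\bar{W}}$ such that

$$|\mathcal{C}_n| \le \exp\{ n I(\bar{W} \wedge \bar{X}, \bar{Y}) + (|\mathcal{X}||\mathcal{Y}||\mathcal{W}| + 4) \log(n+1) \} \tag{74}$$

and such that, for any $(\boldsymbol{x}, \boldsymbol{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}$, there exists $\boldsymbol{w} \in \mathcal{C}_n$ satisfying $(\boldsymbol{w}, \boldsymbol{x}, \boldsymbol{y}) \in \mathcal{T}^n_{\bar{W}\bar{X}\bar{Y}}$.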


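Proof: By truncating the given test channel $P_{\tilde{W}|\bar{X}\bar{Y}}$ into a conditional type, we can obtain a conditional type $P_{\bar{W}|\bar{X}\bar{Y}}$ satisfying (73). Let $Z^{m_n} = \{Z_1, \ldots, Z_{m_n}\}$ be i.i.d. and uniform over $\mathcal{T}^n_{\bar{W}}$. We will show

$$\mathsf{E}\left[ \sum_{(\boldsymbol{x},\boldsymbol{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}} \mathbf{1}\left[ (Z_i, \boldsymbol{x}, \boldsymbol{y}) \notin \mathcal{T}^n_{\bar{W}\bar{X}\bar{Y}} \ \forall\, 1 \le i \le m_n \right] \right] < 1, \tag{75}$$

which implies that there exists $\mathcal{C}_n$ with $|\mathcal{C}_n| \le m_n$ satisfying the desired property. The left-hand side can be manipulated as

\begin{align}
& \sum_{(\boldsymbol{x},\boldsymbol{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}} \mathsf{E}\left[ \mathbf{1}\left[ (Z_i, \boldsymbol{x}, \boldsymbol{y}) \notin \mathcal{T}^n_{\bar{W}\bar{X}\bar{Y}} \ \forall\, 1 \le i \le m_n \right] \right] \tag{76} \\
&= \sum_{(\boldsymbol{x},\boldsymbol{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}} \left( 1 - \frac{|\mathcal{T}^n_{\bar{W}|\bar{X}\bar{Y}}(\boldsymbol{x},\boldsymbol{y})|}{|\mathcal{T}^n_{\bar{W}}|} \right)^{m_n} \tag{77} \\
&\le \sum_{(\boldsymbol{x},\boldsymbol{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}} \exp\left\{ - \frac{|\mathcal{T}^n_{\bar{W}|\bar{X}\bar{Y}}(\boldsymbol{x},\boldsymbol{y})|}{|\mathcal{T}^n_{\bar{W}}|} m_n \right\}, \tag{78}
\end{align}

where the last inequality follows from $(1-t)^m \le \exp\{-tm\}$ for $0 \le t \le 1$. Furthermore, it can be upper bounded as (cf. [4, Lemma 2.5])

\begin{align}
& \sum_{(\boldsymbol{x},\boldsymbol{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}} \exp\left\{ - \frac{|\mathcal{T}^n_{\bar{W}|\bar{X}\bar{Y}}(\boldsymbol{x},\boldsymbol{y})|}{|\mathcal{T}^n_{\bar{W}}|} m_n \right\} \tag{79} \\
&\le \sum_{(\boldsymbol{x},\boldsymbol{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}} \exp\left\{ -(n+1)^{-|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}\, 2^{-n I(\bar{W} \wedge \bar{X}, \bar{Y})} m_n \right\} \tag{80} \\
&\le \exp\{ n H(\bar{X}, \bar{Y}) \} \exp\left\{ -(n+1)^{-|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}\, 2^{-n I(\bar{W} \wedge \bar{X}, \bar{Y})} m_n \right\} \tag{81} \\
&\le \exp\{ n \log |\mathcal{X}||\mathcal{Y}| \} \exp\left\{ -(n+1)^{-|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}\, 2^{-n I(\bar{W} \wedge \bar{X}, \bar{Y})} m_n \right\}. \tag{82}
\end{align}

Thus, by taking $m_n$ such that

\begin{align}
& \exp\{ n I(\bar{W} \wedge \bar{X}, \bar{Y}) + (|\mathcal{X}||\mathcal{Y}||\mathcal{W}| + 2) \log(n+1) \} \tag{83} \\
&\le m_n \le \exp\{ n I(\bar{W} \wedge \bar{X}, \bar{Y}) + (|\mathcal{X}||\mathcal{Y}||\mathcal{W}| + 4) \log(n+1) \}, \tag{84}
\end{align}

(75) holds for sufficiently large $n$. ■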
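The last step of the proof can be checked numerically. With the lower choice (83) of $m_n$, the quantity $(n+1)^{-|\mathcal{X}||\mathcal{Y}||\mathcal{W}|} 2^{-nI} m_n$ is at least $(n+1)^2$, so the bound (82) is below 1 as soon as $n \log |\mathcal{X}||\mathcal{Y}| < (n+1)^2$. The sketch below uses base-2 logarithms for the first factor, which is the more conservative reading; the function names are illustrative and the search limit `n_max` is an arbitrary assumption.

```python
import math


def log2_union_bound(n, size_x, size_y):
    """Exponent of the bound (82) after substituting the lower choice (83) of m_n.

    A negative value means the expected number of uncovered pairs is below 1.
    """
    return n * math.log2(size_x * size_y) - (n + 1) ** 2


def smallest_valid_n(size_x, size_y, n_max=10**6):
    """Smallest blocklength n >= 1 for which the bound drops below 1, or None."""
    for n in range(1, n_max + 1):
        if log2_union_bound(n, size_x, size_y) < 0:
            return n
    return None
```

Since $(n+1)^2$ grows quadratically while $n \log |\mathcal{X}||\mathcal{Y}|$ grows linearly, the bound stays below 1 for every blocklength beyond the returned value, and the threshold depends only on the alphabet sizes.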


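Corollary 1 Suppose that $n \ge n_0(|\mathcal{X}|, |\mathcal{Y}|, |\mathcal{W}|)$. Given a type $P_{\bar{X}\bar{Y}} \in \mathcal{P}_n(\mathcal{X} \times \mathcal{Y})$, there exists a conditional type $P_{\bar{W}|\bar{X}\bar{Y}}$ satisfying

\begin{align}
n H(\bar{X}|\bar{W}) &\le n r_1 + 2|\mathcal{X}||\mathcal{Y}||\mathcal{W}| \log n, \tag{85} \\
n H(\bar{Y}|\bar{W}) &\le n r_2 + 2|\mathcal{X}||\mathcal{Y}||\mathcal{W}| \log n, \tag{86}
\end{align}

and a subset $\mathcal{C}_n \subseteq \mathcal{T}^n_{\bar{W}}$ such that

$$\log |\mathcal{C}_n| \le n R(r_1, r_2|P_{\bar{X}\bar{Y}}) + (3|\mathcal{X}||\mathcal{Y}||\mathcal{W}| + 4) \log(n+1), \tag{87}$$

and such that, for any $(\boldsymbol{x}, \boldsymbol{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}$, there exists $\boldsymbol{w} \in \mathcal{C}_n$ satisfying $(\boldsymbol{w}, \boldsymbol{x}, \boldsymbol{y}) \in \mathcal{T}^n_{\bar{W}\bar{X}\bar{Y}}$.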


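Proof: For the given type $P_{\bar{X}\bar{Y}}$, we pick an optimal test channel $P_{\tilde{W}|\bar{X}\bar{Y}}$ that achieves $R(r_1, r_2|P_{\bar{X}\bar{Y}})$. Then, Lemma 4 implies that there exists a conditional type $P_{\bar{W}|\bar{X}\bar{Y}}$ satisfying (73) and a subset $\mathcal{C}_n$ satisfying the desired properties. From (73), we have

$$\| P_{\bar{W}\bar{X}\bar{Y}} - P_{\tilde{W}\bar{X}\bar{Y}} \|_1 \le \frac{|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}{n}, \tag{88}$$

where $\| \cdot \|_1$ is the variational distance. Thus, by the continuity of entropy functions (cf. [4, Lemma 2.7]), we have

\begin{align}
|H(\bar{X}|\bar{W}) - H(\bar{X}|\tilde{W})| &\le |H(\bar{W}, \bar{X}) - H(\tilde{W}, \bar{X})| + |H(\bar{W}) - H(\tilde{W})| \tag{89} \\
&\le - \frac{|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}{n} \log \frac{|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}{n |\mathcal{X}||\mathcal{W}|} - \frac{|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}{n} \log \frac{|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}{n |\mathcal{W}|} \tag{90} \\
&\le \frac{2|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}{n} \log n. \tag{91}
\end{align}

Similarly, we have

$$|H(\bar{Y}|\bar{W}) - H(\bar{Y}|\tilde{W})| \le \frac{2|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}{n} \log n \tag{92}$$

and

\begin{align}
|I(\bar{W} \wedge \bar{X}, \bar{Y}) - I(\tilde{W} \wedge \bar{X}, \bar{Y})| &\le |H(\bar{W}) - H(\tilde{W})| + |H(\bar{W}|\bar{X}, \bar{Y}) - H(\tilde{W}|\bar{X}, \bar{Y})| \tag{93} \\
&\le \frac{2|\mathcal{X}||\mathcal{Y}||\mathcal{W}|}{n} \log n. \tag{94}
\end{align}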

From Corollary 1, we can derive the following.


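Lemma 5 There exists a code $\Phi_n$ such that

$$P_e(\Phi_n|P_{XY}^n) \le \Pr\left( r_{0,n} < R(r_{1,n}, r_{2,n}|t_{X^n Y^n}) \right), \tag{95}$$

where $t_{X^n Y^n}$ is the joint type of $(X^n, Y^n)$, and

\begin{align}
r_{0,n} &:= \frac{1}{n} \log |\mathcal{M}_0^{(n)}| - \frac{(4|\mathcal{X}||\mathcal{Y}||\mathcal{W}| + 4) \log(n+1)}{n}, \tag{96} \\
r_{1,n} &:= \frac{1}{n} \log |\mathcal{M}_1^{(n)}| - \frac{2|\mathcal{X}||\mathcal{Y}||\mathcal{W}| \log n}{n}, \tag{97} \\
r_{2,n} &:= \frac{1}{n} \log |\mathcal{M}_2^{(n)}| - \frac{2|\mathcal{X}||\mathcal{Y}||\mathcal{W}| \log n}{n}. \tag{98}
\end{align}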

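Proof: We consider the following coding scheme. Upon observing $(X^n, Y^n)$, the encoder first computes its joint type $t_{X^n Y^n}$ and sends it to the decoders via the common channel by using $|\mathcal{X}||\mathcal{Y}| \log(n+1)$ bits. Then, for the joint type $P_{\bar{X}\bar{Y}} = t_{X^n Y^n}$, the encoder finds¹² the test channel $P_{\bar{W}|\bar{X}\bar{Y}}$ and $\mathcal{C}_n$ that are specified by Corollary 1, where we set $(r_1, r_2)$ in the corollary to be $(r_{1,n}, r_{2,n})$ given by (97) and (98), respectively. If $\log |\mathcal{C}_n|$ exceeds

$$\log |\mathcal{M}_0^{(n)}| - |\mathcal{X}||\mathcal{Y}| \log(n+1), \tag{99}$$

then the system aborts and declares an error. Otherwise, the encoder sends $\boldsymbol{w} \in \mathcal{C}_n$ satisfying $(\boldsymbol{w}, X^n, Y^n) \in \mathcal{T}^n_{\bar{W}\bar{X}\bar{Y}}$ to the decoders via the common channel. Since $X^n \in \mathcal{T}^n_{\bar{X}|\bar{W}}(\boldsymbol{w})$, the encoder sends the index of $X^n$ in $\mathcal{T}^n_{\bar{X}|\bar{W}}(\boldsymbol{w})$ to the first decoder via the first private channel by using

\begin{align}
\log |\mathcal{T}^n_{\bar{X}|\bar{W}}(\boldsymbol{w})| &\le n H(\bar{X}|\bar{W}) \tag{100} \\
&\le \log |\mathcal{M}_1^{(n)}| \tag{101}
\end{align}

bits. Similarly, since $Y^n \in \mathcal{T}^n_{\bar{Y}|\bar{W}}(\boldsymbol{w})$, the encoder sends the index of $Y^n$ in $\mathcal{T}^n_{\bar{Y}|\bar{W}}(\boldsymbol{w})$ to the second decoder via the second private channel by using

\begin{align}
\log |\mathcal{T}^n_{\bar{Y}|\bar{W}}(\boldsymbol{w})| &\le n H(\bar{Y}|\bar{W}) \tag{102} \\
&\le \log |\mathcal{M}_2^{(n)}| \tag{103}
\end{align}

bits.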


12 The encoder and the decoders agree on the choice of the test channel $P_{\bar{W}|\bar{X}\bar{Y}}$ and the subset $\mathcal{C}_n$ for each joint type.
In the above coding scheme, an error occurs only when $\log |\mathcal{C}_n|$ exceeds (99). Thus, noting (96) and (87), the error probability is upper bounded by

$$\Pr\left( r_{0,n} < R(r_{1,n}, r_{2,n}|t_{X^n Y^n}) \right), \tag{104}$$

which completes the proof. ■

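Now, we evaluate the right-hand side of (95). Let (cf. Section II-A for the notations $\theta$ and $m$)

$$\mathcal{K}_n := \left\{ P_{\bar{X}\bar{Y}} \in \mathcal{P}_n(\mathcal{X} \times \mathcal{Y}) : |\theta_i(P_{\bar{X}\bar{Y}}) - \theta_i(P_{XY})| \le \sqrt{\frac{\log n}{n}} \ \ \forall\, 1 \le i \le m-1 \right\}. \tag{105}$$

The following is an immediate consequence of Hoeffding's inequality.

Proposition 2 We have

$$\Pr\left( t_{X^n Y^n} \notin \mathcal{K}_n \right) \le \frac{2(m-1)}{n^2}. \tag{106}$$

To evaluate (95), we proceed as follows. We set

$$\frac{1}{n} \log |\mathcal{M}_i^{(n)}| = r_i^* + \frac{L_i}{\sqrt{n}} \tag{107}$$

for $i = 1,2$ (we will specify $|\mathcal{M}_0^{(n)}|$ later). Since we assumed that the second-order derivatives of $R(r_1, r_2|P_\theta)$ with respect to $(r_1, r_2, \theta)$ are bounded in a neighbourhood of $(r_1^*, r_2^*, \theta(P_{XY}))$, and since $r_{i,n} - r_i^* = \frac{L_i}{\sqrt{n}} + O\!\left(\frac{\log n}{n}\right)$ when $t_{X^n Y^n} \in \mathcal{K}_n$, we can Taylor expand $R(r_{1,n}, r_{2,n}|t_{X^n Y^n})$ as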


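\begin{align}
& R(r_{1,n}, r_{2,n}|t_{X^n Y^n}) \le R(r_1^*, r_2^*|P_{XY}) - \lambda_1^* \frac{L_1}{\sqrt{n}} - \lambda_2^* \frac{L_2}{\sqrt{n}} \tag{108} \\
&\quad + \sum_{i=1}^{m-1} (\theta_i(t_{X^n Y^n}) - \theta_i(P_{XY}))(\jmath_{XY}(i) - \jmath_{XY}(m)) + c\, \frac{\log n}{n} \tag{109} \\
&= R(r_1^*, r_2^*|P_{XY}) - \lambda_1^* \frac{L_1}{\sqrt{n}} - \lambda_2^* \frac{L_2}{\sqrt{n}} \tag{110} \\
&\quad + \sum_{x,y} (t_{X^n Y^n}(x,y) - P_{XY}(x,y))\, \jmath_{XY}(x,y) + c\, \frac{\log n}{n} \tag{111} \\
&= \sum_{x,y} t_{X^n Y^n}(x,y)\, \jmath_{XY}(x,y) - \lambda_1^* \frac{L_1}{\sqrt{n}} - \lambda_2^* \frac{L_2}{\sqrt{n}} + c\, \frac{\log n}{n} \tag{112} \\
&= \frac{1}{n} \sum_{i=1}^n \jmath_{XY}(X_i, Y_i) - \lambda_1^* \frac{L_1}{\sqrt{n}} - \lambda_2^* \frac{L_2}{\sqrt{n}} + c\, \frac{\log n}{n} \tag{113}
\end{align}

for some constant $c > 0$ provided that $n$ is sufficiently large, where the first inequality follows from Lemma 3 and the second equality follows from $\mathsf{E}[\jmath_{XY}(X,Y)] = R(r_1^*, r_2^*|P_{XY})$. Thus, we have

\begin{align}
& \Pr\left( r_{0,n} < R(r_{1,n}, r_{2,n}|t_{X^n Y^n}) \right) \tag{114} \\
&\le \Pr\left( t_{X^n Y^n} \in \mathcal{K}_n,\ r_{0,n} < R(r_{1,n}, r_{2,n}|t_{X^n Y^n}) \right) + \Pr\left( t_{X^n Y^n} \notin \mathcal{K}_n \right) \tag{115} \\
&\le \Pr\left( r_{0,n} + \lambda_1^* \frac{L_1}{\sqrt{n}} + \lambda_2^* \frac{L_2}{\sqrt{n}} < \frac{1}{n} \sum_{i=1}^n \jmath_{XY}(X_i, Y_i) + c\, \frac{\log n}{n} \right) + \frac{2(m-1)}{n^2}. \tag{116}
\end{align}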
(116) 
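Thus, if we set

$$\frac{1}{n} \log |\mathcal{M}_0^{(n)}| = R(r_1^*, r_2^*|P_{XY}) + \frac{L_0}{\sqrt{n}} + \frac{(4|\mathcal{X}||\mathcal{Y}||\mathcal{W}| + 4 + c) \log(n+1)}{n}, \tag{117}$$

there exists a code $\Phi_n$ such that

\begin{align}
P_e(\Phi_n|P_{XY}^n) &\le \Pr\left( R(r_1^*, r_2^*|P_{XY}) + \frac{L_0}{\sqrt{n}} + \lambda_1^* \frac{L_1}{\sqrt{n}} + \lambda_2^* \frac{L_2}{\sqrt{n}} < \frac{1}{n} \sum_{i=1}^n \jmath_{XY}(X_i, Y_i) \right) + \frac{2(m-1)}{n^2} \tag{118} \\
&= \Pr\left( L_0 + \lambda_1^* L_1 + \lambda_2^* L_2 < \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( \jmath_{XY}(X_i, Y_i) - R(r_1^*, r_2^*|P_{XY}) \right) \right) + \frac{2(m-1)}{n^2}. \tag{119}
\end{align}

Thus, if we set $L_0, L_1, L_2$ so that

$$L_0 + \lambda_1^* L_1 + \lambda_2^* L_2 \ge \sqrt{V_{XY}}\, Q^{-1}(\varepsilon), \tag{120}$$

by applying the central limit theorem, we have

$$\limsup_{n\to\infty} P_e(\Phi_n|P_{XY}^n) \le \varepsilon. \tag{121}$$

Since this code also satisfies (16)-(18), we have shown the $(\varepsilon, r_0^*, r_1^*, r_2^*)$-achievability of $(L_0, L_1, L_2)$.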

B. Proof of Converse 

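In this section, we prove the converse part of Theorem 1. We first derive a kind of strong converse bound when a code $\Phi_n$ is applied to a source $(X^n, Y^n) \sim P_{\mathcal{T}^n_{\bar{X}\bar{Y}}}$, where $P_{\mathcal{T}^n_{\bar{X}\bar{Y}}}$ is the uniform distribution on the type class $\mathcal{T}^n_{\bar{X}\bar{Y}}$ for a fixed type $P_{\bar{X}\bar{Y}}$.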


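Lemma 6 Suppose that the correct probability satisfies

$$P_c(\Phi_n|P_{\mathcal{T}^n_{\bar{X}\bar{Y}}}) \ge 2^{-n\alpha_n} \tag{122}$$

for some positive number $\alpha_n$. Let $\beta_n$ be another positive number. Then there exists $P_{\bar{W}|\bar{X}\bar{Y}}$ with $|\mathcal{W}| \le |\mathcal{X}||\mathcal{Y}| + 2$ such that

\begin{align}
\frac{1}{n} \log |\mathcal{M}_0^{(n)}| &\ge I(\bar{W} \wedge \bar{X}, \bar{Y}) - \frac{|\mathcal{X}||\mathcal{Y}| \log(n+1)}{n} - (\alpha_n + \beta_n), \tag{123} \\
\frac{1}{n} \log |\mathcal{M}_1^{(n)}| &\ge H(\bar{X}|\bar{W}) - \frac{1}{n} - 2^{-n\beta_n} \log |\mathcal{X}|, \tag{124} \\
\frac{1}{n} \log |\mathcal{M}_2^{(n)}| &\ge H(\bar{Y}|\bar{W}) - \frac{1}{n} - 2^{-n\beta_n} \log |\mathcal{Y}|, \tag{125}
\end{align}

where $(\bar{X}, \bar{Y}) \sim P_{\bar{X}\bar{Y}}$.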

Proof: We prove this lemma by using the perturbation approach used in (7), ( 8 ). Let 

v xy ■= \ (x,v) e Txy ■ ipi(<p 0 (x,y),cpi(x,y)) = x,f) 2 {}Po(x,y),tp 2 (x,y)) = y 


(123) 

(124) 

(125) 


be the set of correctly decodable sequences on 7TY- Let Qtv-- be a distribution on 7YY defined by 


Qr$Jx,y) = 


2 n(a„+£„ )p Ti ( X ,y) 


2 n(<*n+Mp T (Vxy) + (l - PtsA'Dxy)) 


(126) 


(127) 
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for (x,y) G T>xy an ^ 


QrnJx,y) = 


p TsAx,y) 


2 n(a n+0n )p r ^ i - D .. ) + (1 _ P T ^(Vxy)) 

for (x,y) (f: Pxy- Then, from (1122b . we have 

2 n Pn 

XY ) — 2 nj3 n y ^ * 

In other words, if we use the same code ff>„ to source (X n . Y n ) ~ Q'T' XY • we have 

Pe(^|Qr^)<2-”^. 

Furthermore, for every (a:, y) £ TJ' V , we have 

Qr^(x,y) < 2 n (“"+^)p r n_ ( x ,y) 

and 

Pt-» (x,y) 

O-T-n (x v) > _ Xf ' K y _ 

- y> - 2 n(a n +p n ) p T „_ (V X y) + 2 n ^+^)(l - Pt^( d xy)) 

= 2- n ^+Mp T n_( x ,y). 
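The effect of the perturbation can be seen on a toy example. The sketch below is illustrative only: the "type class" is replaced by an abstract set of `n_total` equiprobable sequences, and the numbers chosen for $n\alpha_n$, $n\beta_n$, and the set sizes are assumptions made for the example, not values from the paper.

```python
def perturbed_masses(n_total: int, n_decodable: int, boost: float):
    """Return (q_in, q_out): perturbed mass of one decodable / one non-decodable sequence.

    The base distribution is uniform on n_total sequences, n_decodable of which
    are decoded correctly; boost plays the role of 2^{n(alpha_n + beta_n)}.
    """
    p = 1.0 / n_total
    p_decodable = n_decodable / n_total
    normaliser = boost * p_decodable + (1.0 - p_decodable)
    return boost * p / normaliser, p / normaliser


n_alpha, n_beta = 6, 4                 # hypothetical values of n*alpha_n and n*beta_n
boost = 2.0 ** (n_alpha + n_beta)      # 2^{n(alpha_n + beta_n)}
n_total, n_decodable = 4096, 64        # correct probability 64/4096 = 2^{-n_alpha}

q_in, q_out = perturbed_masses(n_total, n_decodable, boost)
q_decodable = n_decodable * q_in       # perturbed probability of correct decoding
error_after_perturbation = 1.0 - q_decodable
```

In this example the correct probability rises from $2^{-6}$ under the uniform distribution to more than $2^{4}/(2^{4}+1)$ under the perturbed one, while every point mass changes by a factor of at most $2^{10}$ in either direction.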

Now, by a slight modification of the standard argument$^{13}$, we have

\begin{align}
\frac{1}{n} \log |\mathcal{M}_0^{(n)}| &\ge \frac{1}{n} H(S_0) \tag{134} \\
&\ge \frac{1}{n} I(S_0 \wedge X^n, Y^n) \tag{135} \\
&= \frac{1}{n} \sum_{i=1}^n I(S_0 \wedge X_i, Y_i | X^{i-1}, Y^{i-1}) \tag{136} \\
&= \frac{1}{n} \sum_{i=1}^n \Big[ I(S_0, X^{i-1}, Y^{i-1} \wedge X_i, Y_i) - I(X^{i-1}, Y^{i-1} \wedge X_i, Y_i) \Big] \tag{137} \\
&= \frac{1}{n} \sum_{i=1}^n I(S_0, X^{i-1}, Y^{i-1} \wedge X_i, Y_i) - \bigg[ \frac{1}{n} \sum_{i=1}^n H(X_i, Y_i) - \frac{1}{n} H(X^n, Y^n) \bigg] \tag{138} \\
&= \frac{1}{n} \sum_{i=1}^n I(W_i \wedge X_i, Y_i) - \bigg[ \frac{1}{n} \sum_{i=1}^n H(X_i, Y_i) - \frac{1}{n} H(X^n, Y^n) \bigg] \tag{139} \\
&= I(W_J \wedge X_J, Y_J | J) - \bigg[ H(X_J, Y_J | J) - \frac{1}{n} H(X^n, Y^n) \bigg] \tag{140} \\
&= I(W_J \wedge X_J, Y_J | J) + I(J \wedge X_J, Y_J) - \bigg[ H(X_J, Y_J) - \frac{1}{n} H(X^n, Y^n) \bigg] \tag{141} \\
&= I(J, W_J \wedge X_J, Y_J) - \bigg[ H(X_J, Y_J) - \frac{1}{n} H(X^n, Y^n) \bigg], \tag{142}
\end{align}

13 Note that all the information quantities are evaluated with respect to $(X^n, Y^n) \sim Q_{\mathcal{T}^n_{\bar{X}\bar{Y}}}$; for example, $(X_i, Y_i)$ and $(X^{i-1}, Y^{i-1})$ may not be independent.

where $S_0 = \varphi_0(X^n, Y^n)$, $W_i = (S_0, X^{i-1}, Y^{i-1})$, and $J$ is the uniform random variable on $\{1, \ldots, n\}$ that is independent of all the other random variables$^{14}$. We also have


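\begin{align}
\frac{1}{n} \log |\mathcal{M}_1^{(n)}| &\ge \frac{1}{n} H(S_1) \tag{143} \\
&\ge \frac{1}{n} H(S_1 | S_0) \tag{144} \\
&\ge \frac{1}{n} I(S_1 \wedge X^n | S_0) \tag{145} \\
&= \frac{1}{n} H(X^n | S_0) - \frac{1}{n} H(X^n | S_0, S_1) \tag{146} \\
&\ge \frac{1}{n} H(X^n | S_0) - \bigg[ \frac{1}{n} + 2^{-n\beta_n} \log |\mathcal{X}| \bigg] \tag{147} \\
&= \frac{1}{n} \sum_{i=1}^n H(X_i | S_0, X^{i-1}) - \bigg[ \frac{1}{n} + 2^{-n\beta_n} \log |\mathcal{X}| \bigg] \tag{148} \\
&\ge \frac{1}{n} \sum_{i=1}^n H(X_i | S_0, X^{i-1}, Y^{i-1}) - \bigg[ \frac{1}{n} + 2^{-n\beta_n} \log |\mathcal{X}| \bigg] \tag{149} \\
&= H(X_J | J, W_J) - \bigg[ \frac{1}{n} + 2^{-n\beta_n} \log |\mathcal{X}| \bigg], \tag{150}
\end{align}

where $S_1 = \varphi_1(X^n, Y^n)$, and the fourth inequality follows from the Fano inequality and (130). Similarly, we have

$$ \frac{1}{n} \log |\mathcal{M}_2^{(n)}| \ge H(Y_J | J, W_J) - \bigg[ \frac{1}{n} + 2^{-n\beta_n} \log |\mathcal{Y}| \bigg]. \tag{151} $$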

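By the support lemma (cf. 0|), there exists $P_{W|X_J Y_J}$ such that $|\mathcal{W}| \le |\mathcal{X}||\mathcal{Y}| + 2$ and

\begin{align}
I(J, W_J \wedge X_J, Y_J) &= I(W \wedge X_J, Y_J), \tag{152} \\
H(X_J | J, W_J) &= H(X_J | W), \tag{153} \\
H(Y_J | J, W_J) &= H(Y_J | W). \tag{154}
\end{align}

Now, we claim that the distribution $P_{X_J Y_J}$ coincides with the type $P_{\bar{X}\bar{Y}}$. In fact, for every fixed $(\mathbf{x}, \mathbf{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}$, we have

$$ \Pr\Big( (X_J, Y_J) = (a, b) \,\Big|\, (X^n, Y^n) = (\mathbf{x}, \mathbf{y}) \Big) = P_{\bar{X}\bar{Y}}(a, b) \tag{155} $$

for every $(a, b) \in \mathcal{X} \times \mathcal{Y}$. Thus, we have

\begin{align}
P_{X_J Y_J}(a, b) &= \sum_{(\mathbf{x}, \mathbf{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}} Q_{\mathcal{T}^n_{\bar{X}\bar{Y}}}(\mathbf{x}, \mathbf{y}) \Pr\Big( (X_J, Y_J) = (a, b) \,\Big|\, (X^n, Y^n) = (\mathbf{x}, \mathbf{y}) \Big) \tag{156} \\
&= P_{\bar{X}\bar{Y}}(a, b). \tag{157}
\end{align}

14 Note that $J$ and $(X_J, Y_J)$ may not be independent.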
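The claim that $P_{X_J Y_J}$ equals the type, whatever distribution is placed on the type class, can be verified on a small example. This is a sketch with hypothetical helper names; exact rational arithmetic is used so that the comparison is an equality rather than an approximation.

```python
from collections import Counter
from fractions import Fraction


def joint_type(xs, ys):
    """Empirical joint distribution (type) of a pair of equal-length sequences."""
    n = len(xs)
    return {pair: Fraction(c, n) for pair, c in Counter(zip(xs, ys)).items()}


def law_of_randomly_indexed_pair(xs, ys):
    """Law of (x_J, y_J) for fixed sequences, with J uniform on the index set."""
    n = len(xs)
    law = {}
    for j in range(n):
        pair = (xs[j], ys[j])
        law[pair] = law.get(pair, Fraction(0)) + Fraction(1, n)
    return law


def mixture_marginal(weighted_sequences):
    """Law of (X_J, Y_J) when the sequence pair is drawn from a weighted list.

    weighted_sequences is a list of (weight, xs, ys); the weights play the
    role of an arbitrary distribution on sequences of one common type.
    """
    law = {}
    for weight, xs, ys in weighted_sequences:
        for pair, mass in law_of_randomly_indexed_pair(xs, ys).items():
            law[pair] = law.get(pair, Fraction(0)) + weight * mass
    return law
```

For instance, mixing the sequence pair `("aaab", "0010")` and its permutation `("baaa", "0100")` with weights 3/4 and 1/4 returns the common type of the two pairs, independently of the weights.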
Here, by letting $W$ be the random variable induced by the channel $P_{W|X_J Y_J}$ from $(\bar{X}, \bar{Y}) \sim P_{\bar{X}\bar{Y}}$, we have

\begin{align}
I(J, W_J \wedge X_J, Y_J) &= I(W \wedge \bar{X}, \bar{Y}), \tag{158} \\
H(X_J | J, W_J) &= H(\bar{X} | W), \tag{159} \\
H(Y_J | J, W_J) &= H(\bar{Y} | W). \tag{160}
\end{align}

Finally, we evaluate the residual term in (142). Since $P_{X_J Y_J} = P_{\bar{X}\bar{Y}}$, we have

$$ H(X_J, Y_J) = H(\bar{X}, \bar{Y}). \tag{161} $$

Furthermore, it is well known that (cf. 0)

$$ n H(\bar{X}, \bar{Y}) - |\mathcal{X}||\mathcal{Y}| \log(n+1) \le \log |\mathcal{T}^n_{\bar{X}\bar{Y}}| \le n H(\bar{X}, \bar{Y}). \tag{162} $$
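The type class size bound stated just above is easy to confirm for small block lengths, since the size of a type class is a multinomial coefficient. The following sketch assumes base-2 logarithms and a hypothetical vector of joint symbol counts.

```python
from math import comb, log2


def log2_type_class_size(counts):
    """log2 of the multinomial coefficient n! / (c_1! c_2! ... c_k!)."""
    size, remaining = 1, sum(counts)
    for c in counts:
        size *= comb(remaining, c)
        remaining -= c
    return log2(size)


def empirical_entropy(counts):
    """Entropy (in bits) of the type with the given symbol counts."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


counts = [6, 2, 3, 1]          # hypothetical joint counts, |X||Y| = 4 and n = 12
n = sum(counts)
alphabet_size = len(counts)    # plays the role of |X||Y|

upper = n * empirical_entropy(counts)
lower = upper - alphabet_size * log2(n + 1)
log_size = log2_type_class_size(counts)
```

Here `lower <= log_size <= upper` holds, with the gap on the lower side governed by the polynomial factor $(n+1)^{|\mathcal{X}||\mathcal{Y}|}$.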
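From (131) and (133), we have

$$ \bigg| \log \frac{1}{Q_{\mathcal{T}^n_{\bar{X}\bar{Y}}}(\mathbf{x}, \mathbf{y})} - \log |\mathcal{T}^n_{\bar{X}\bar{Y}}| \bigg| \le n(\alpha_n + \beta_n) \tag{163} $$

for every $(\mathbf{x}, \mathbf{y}) \in \mathcal{T}^n_{\bar{X}\bar{Y}}$. Thus, we have

$$ \Big| H(X^n, Y^n) - \log |\mathcal{T}^n_{\bar{X}\bar{Y}}| \Big| \le n(\alpha_n + \beta_n). \tag{164} $$

By combining (161), (162), and (164), we have

\begin{align}
& \bigg| H(X_J, Y_J) - \frac{1}{n} H(X^n, Y^n) \bigg| \tag{165} \\
&\le \bigg| H(X_J, Y_J) - \frac{1}{n} \log |\mathcal{T}^n_{\bar{X}\bar{Y}}| \bigg| + \bigg| \frac{1}{n} \log |\mathcal{T}^n_{\bar{X}\bar{Y}}| - \frac{1}{n} H(X^n, Y^n) \bigg| \tag{166} \\
&\le \frac{|\mathcal{X}||\mathcal{Y}| \log(n+1)}{n} + (\alpha_n + \beta_n). \tag{167}
\end{align}

Consequently, from (142), (150), (151), (158)-(160), and (167), we have the claim of the lemma.

We now use Lemma 6 by setting

$$ \alpha_n = \beta_n = \frac{\log n}{n}. \tag{168} $$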


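Lemma 7 For any code $\Phi_n$, it holds that

$$ P_e(\Phi_n | P_{XY}^n) \ge \Pr\Big( r_{0,n} < R(r_{1,n}, r_{2,n} | t_{X^nY^n}) \Big) - \frac{1}{n}, \tag{169} $$

where $t_{X^nY^n}$ is the joint type of $(X^n, Y^n)$,

\begin{align}
r_{0,n} &:= \frac{1}{n} \log |\mathcal{M}_0^{(n)}| + \frac{|\mathcal{X}||\mathcal{Y}| \log(n+1)}{n} + (\alpha_n + \beta_n), \tag{170} \\
r_{1,n} &:= \frac{1}{n} \log |\mathcal{M}_1^{(n)}| + \frac{1}{n} + 2^{-n\beta_n} \log |\mathcal{X}|, \tag{171} \\
r_{2,n} &:= \frac{1}{n} \log |\mathcal{M}_2^{(n)}| + \frac{1}{n} + 2^{-n\beta_n} \log |\mathcal{Y}|, \tag{172}
\end{align}

and $\alpha_n$ and $\beta_n$ are set as in (168).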

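Proof: Let $r_n = (r_{0,n}, r_{1,n}, r_{2,n})$. Lemma 6 implies that if

$$ r_n \notin \mathcal{R}_{\mathrm{GW}}(P_{\bar{X}\bar{Y}}), \tag{173} $$

then the correct probability satisfies

$$ P_c(\Phi_n | P_{\mathcal{T}^n_{\bar{X}\bar{Y}}}) < 2^{-n\alpha_n}. \tag{174} $$

Thus, we have

\begin{align}
P_e(\Phi_n | P_{XY}^n) &= \sum_{P_{\bar{X}\bar{Y}} \in \mathcal{P}_n(\mathcal{X} \times \mathcal{Y})} P_{XY}^n(\mathcal{T}^n_{\bar{X}\bar{Y}})\, P_e(\Phi_n | P_{\mathcal{T}^n_{\bar{X}\bar{Y}}}) \tag{175} \\
&\ge \sum_{\substack{P_{\bar{X}\bar{Y}} \in \mathcal{P}_n(\mathcal{X} \times \mathcal{Y}) \\ r_n \notin \mathcal{R}_{\mathrm{GW}}(P_{\bar{X}\bar{Y}})}} P_{XY}^n(\mathcal{T}^n_{\bar{X}\bar{Y}}) \big( 1 - 2^{-n\alpha_n} \big) \tag{176} \\
&\ge \sum_{\substack{P_{\bar{X}\bar{Y}} \in \mathcal{P}_n(\mathcal{X} \times \mathcal{Y}) \\ r_n \notin \mathcal{R}_{\mathrm{GW}}(P_{\bar{X}\bar{Y}})}} P_{XY}^n(\mathcal{T}^n_{\bar{X}\bar{Y}}) - \frac{1}{n}, \tag{177}
\end{align}

where the last inequality follows from the choice of $\alpha_n$. By denoting the type of $(X^n, Y^n)$ by $t_{X^nY^n}$, the first term of the above bound can be written as

\begin{align}
\sum_{\substack{P_{\bar{X}\bar{Y}} \in \mathcal{P}_n(\mathcal{X} \times \mathcal{Y}) \\ r_n \notin \mathcal{R}_{\mathrm{GW}}(P_{\bar{X}\bar{Y}})}} P_{XY}^n(\mathcal{T}^n_{\bar{X}\bar{Y}}) &= \Pr\Big( r_n \notin \mathcal{R}_{\mathrm{GW}}(t_{X^nY^n}) \Big) \tag{178} \\
&= \Pr\Big( r_{0,n} < R(r_{1,n}, r_{2,n} | t_{X^nY^n}) \Big), \tag{179}
\end{align}

which completes the proof. ■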

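Now, we evaluate the first term of the right hand side of (169). Suppose that $|\mathcal{M}_i^{(n)}|$ for $i = 0, 1, 2$ satisfy the second-order rate constraints for some $(L_0, L_1, L_2)$ satisfying

$$ L_0 + \lambda_1^* L_1 + \lambda_2^* L_2 < \sqrt{V_{XY}}\, Q^{-1}(\varepsilon). \tag{180} $$

Then, we can write

$$ \frac{1}{n} \log |\mathcal{M}_i^{(n)}| = r_i^* + \frac{L_i}{\sqrt{n}} + \delta_{i,n}, \quad i = 0, 1, 2, \tag{181} $$

for some $\delta_{i,n} = o(1/\sqrt{n})$. Thus, in the same manner as the achievability part, we have$^{15}$

\begin{align}
& \Pr\Big( r_{0,n} < R(r_{1,n}, r_{2,n} | t_{X^nY^n}) \Big) \tag{182} \\
&\ge \Pr\Big( t_{X^nY^n} \in \mathcal{K}_n,\ r_{0,n} < R(r_{1,n}, r_{2,n} | t_{X^nY^n}) \Big) \tag{183} \\
&\ge \Pr\bigg( t_{X^nY^n} \in \mathcal{K}_n,\ R(r_1^*, r_2^* | P_{XY}) + \frac{L_0}{\sqrt{n}} + \lambda_1^* \frac{L_1}{\sqrt{n}} + \lambda_2^* \frac{L_2}{\sqrt{n}} < \frac{1}{n} \sum_{i=1}^n \jmath_{XY}(X_i, Y_i) - \delta_n \bigg) \tag{184} \\
&\ge \Pr\bigg( R(r_1^*, r_2^* | P_{XY}) + \frac{L_0}{\sqrt{n}} + \lambda_1^* \frac{L_1}{\sqrt{n}} + \lambda_2^* \frac{L_2}{\sqrt{n}} < \frac{1}{n} \sum_{i=1}^n \jmath_{XY}(X_i, Y_i) - \delta_n \bigg) - \Pr\Big( t_{X^nY^n} \notin \mathcal{K}_n \Big) \tag{185} \\
&\ge \Pr\bigg( R(r_1^*, r_2^* | P_{XY}) + \frac{L_0}{\sqrt{n}} + \lambda_1^* \frac{L_1}{\sqrt{n}} + \lambda_2^* \frac{L_2}{\sqrt{n}} < \frac{1}{n} \sum_{i=1}^n \jmath_{XY}(X_i, Y_i) - \delta_n \bigg) - \frac{2(m-1)}{n^2} \tag{186}
\end{align}

for some $\delta_n = o(1/\sqrt{n})$. Thus, by the central limit theorem, we have

$$ \liminf_{n \to \infty} P_e(\Phi_n | P_{XY}^n) > \varepsilon, \tag{187} $$

which implies that any $(L_0, L_1, L_2)$ satisfying (180) is not $(\varepsilon, r_0^*, r_1^*, r_2^*)$-achievable.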
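The central limit step can be illustrated on a toy case in which the normalized sum has an exactly computable distribution. The sketch below is an assumption-laden stand-in, not the setting of the theorem: a scalar i.i.d. Bernoulli($p$) source with $p < 1/2$ replaces $(X, Y)$, and the entropy density $\log_2 (1/P(x))$ replaces $\jmath_{XY}$, so that the sum is an affine function of a binomial count.

```python
from math import exp, floor, lgamma, log, log2, sqrt
from statistics import NormalDist


def binomial_upper_tail(n, p, k_min):
    """Pr(K >= k_min) for K ~ Binomial(n, p), with each term computed in the log domain."""
    total = 0.0
    for k in range(max(k_min, 0), n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log(1.0 - p))
        total += exp(log_pmf)
    return total


def exact_vs_gaussian(n, p, L):
    """Compare Pr(L < n^{-1/2} * sum_i (j(X_i) - E[j])) with Q(L / sqrt(V)).

    Requires 0 < p < 1/2 so that the density gap d below is positive.
    """
    d = log2((1.0 - p) / p)            # j(1) - j(0) for j(x) = log2(1 / P(x))
    variance = p * (1.0 - p) * d * d   # Var[j(X)]
    # The event is K > n*p + L*sqrt(n)/d, where K counts the ones.
    threshold = n * p + L * sqrt(n) / d
    k_min = floor(threshold) + 1       # smallest integer strictly above the threshold
    exact = binomial_upper_tail(n, p, k_min)
    gaussian = 1.0 - NormalDist().cdf(L / sqrt(variance))
    return exact, gaussian
```

Calling `exact_vs_gaussian(2000, 0.11, 0.5)` returns two probabilities that differ by a few hundredths at most, in line with the Berry-Esseen bound for this block length.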


V. On The Pangloss Plane 

In general, it is extremely difficult to compute the first-order region $\mathcal{R}_{\mathrm{GW}}(P_{XY})$, and the same is true of the second-order region $\mathcal{L}_{\mathrm{GW}}(\varepsilon; r_0^*, r_1^*, r_2^*)$. Nevertheless, to get some insight, let us consider the following tractable case.


The region $\mathcal{R}_{\mathrm{GW}}(P_{XY})$ is contained in the outer region characterized by three planes (cf. Fig. 2):

\begin{align}
r_0 + r_1 + r_2 &\ge H(X, Y), \tag{188} \\
r_0 + r_1 &\ge H(X), \tag{189} \\
r_0 + r_2 &\ge H(Y). \tag{190}
\end{align}

The first plane is called the Pangloss plane in [6]. Let

\begin{align}
\mathcal{PL}(P_{XY}) &:= \big\{ (r_0, r_1, r_2) \in \mathcal{R}_{\mathrm{GW}}(P_{XY}) : r_0 + r_1 + r_2 = H(X, Y) \big\} \tag{191} \\
&= \big\{ (I(W \wedge X, Y), H(X|W), H(Y|W)) : |\mathcal{W}| \le |\mathcal{X}||\mathcal{Y}| + 2,\ X -\!\!\circ\!\!- W -\!\!\circ\!\!- Y \big\} \tag{192}
\end{align}

be the set of all achievable rate triplets on the Pangloss plane, where $X -\!\!\circ\!\!- W -\!\!\circ\!\!- Y$ means that $(X, W, Y)$ form a Markov chain. Although an explicit characterization of $\mathcal{PL}(P_{XY})$ is not clear in general, it contains the following triangular region

$$ \mathrm{conv}\big\{ (H(X, Y), 0, 0),\ (H(Y), H(X|Y), 0),\ (H(X), 0, H(Y|X)) \big\}, \tag{193} $$

and the altitude of the lowermost points is $r_0 = C_W(P_{XY})$, where

\begin{align}
C_W(P_{XY}) &:= \min\big\{ r_0 : \exists\, r_1, r_2 \text{ s.t. } (r_0, r_1, r_2) \in \mathcal{PL}(P_{XY}) \big\} \tag{194} \\
&= \min\big\{ I(W \wedge X, Y) : |\mathcal{W}| \le |\mathcal{X}||\mathcal{Y}|,\ X -\!\!\circ\!\!- W -\!\!\circ\!\!- Y \big\} \tag{195}
\end{align}

is Wyner's common information [38] (cf. Fig. 2).

15 Note also that $r_0^* = R(r_1^*, r_2^* | P_{XY})$.
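The three corner points of the triangular region can be computed directly from a joint distribution and checked against the outer planes. This is a small sketch with hypothetical function names; the joint distribution is passed as a nested list `joint[x][y]`, and entropies are in bits.

```python
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


def entropies(joint):
    """Return (H(X,Y), H(X), H(Y)) for joint[x][y]."""
    h_xy = entropy([p for row in joint for p in row])
    h_x = entropy([sum(row) for row in joint])
    h_y = entropy([sum(col) for col in zip(*joint)])
    return h_xy, h_x, h_y


def triangle_corners(joint):
    """Corners (H(X,Y),0,0), (H(Y),H(X|Y),0), (H(X),0,H(Y|X)) of the triangle."""
    h_xy, h_x, h_y = entropies(joint)
    return [(h_xy, 0.0, 0.0), (h_y, h_xy - h_y, 0.0), (h_x, 0.0, h_xy - h_x)]


def on_pangloss_plane_within_outer_region(rates, joint, tol=1e-9):
    """True if the sum-rate plane is met with equality and the other two planes hold."""
    r0, r1, r2 = rates
    h_xy, h_x, h_y = entropies(joint)
    return (abs(r0 + r1 + r2 - h_xy) <= tol
            and r0 + r1 >= h_x - tol
            and r0 + r2 >= h_y - tol)
```

For a doubly symmetric binary source such as `[[0.45, 0.05], [0.05, 0.45]]`, all three corners pass the check, whereas a triple with $r_0 = 0$ on the sum-rate plane violates one of the remaining two planes because $X$ and $Y$ are dependent.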

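When $(r_0^*, r_1^*, r_2^*) \in \mathcal{PL}(P_{XY})$, it holds that

$$ R(r_1^*, r_2^* | P_{XY}) = H(X, Y) - r_1^* - r_2^*. \tag{196} $$

Thus, $\lambda_1^* = \lambda_2^* = 1$. Also, since $(r_0^*, r_1^*, r_2^*)$ is achieved by an optimal test channel satisfying $X -\!\!\circ\!\!- W^* -\!\!\circ\!\!- Y$, it holds that

\begin{align}
\jmath_{XY}(x, y) &= \log \frac{P_{W^*|XY}(w|x, y)}{P_{W^*}(w)} + \lambda_1^* \bigg( \log \frac{1}{P_{X|W^*}(x|w)} - r_1^* \bigg) + \lambda_2^* \bigg( \log \frac{1}{P_{Y|W^*}(y|w)} - r_2^* \bigg) \tag{197} \\
&= \log \frac{1}{P_{XY}(x, y)} - r_1^* - r_2^*. \tag{198}
\end{align}

Thus, the second-order region is characterized as follows.$^{16}$

Corollary 2 When $(r_0^*, r_1^*, r_2^*)$ is a strict inner point of $\mathcal{PL}(P_{XY})$,$^{17}$ it holds that

$$ \mathcal{L}_{\mathrm{GW}}(\varepsilon; r_0^*, r_1^*, r_2^*) = \Big\{ (L_0, L_1, L_2) : L_0 + L_1 + L_2 \ge \sqrt{V_{XY}}\, Q^{-1}(\varepsilon) \Big\}, \tag{199} $$

where $V_{XY}$ is given by

\begin{align}
V_{XY} &= \mathrm{Var}\big[ \jmath_{XY}(X, Y) \big] \tag{200} \\
&= \mathrm{Var}\bigg[ \log \frac{1}{P_{XY}(X, Y)} \bigg]. \tag{201}
\end{align}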

In fact, the sum constraint on the second-order rates in the above corollary coincides with the cooperative outer bound, where the two decoders cooperate. Thus, on the Pangloss plane, there is no sum-rate loss compared to the cooperative decoding scheme up to the second order, which is quite remarkable. However, this does not mean that the auxiliary random variable is not needed; the auxiliary random variable is needed to construct a code that achieves the optimal second-order region.
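Both ingredients of the corollary can be checked numerically on a small example: the collapse of the tilted information density to the entropy density when $\lambda_1^* = \lambda_2^* = 1$, and the resulting variance. The sketch below uses hypothetical names, works in bits, and takes an arbitrary (not necessarily optimal) auxiliary variable $W$ with $X -\!\!\circ\!\!- W -\!\!\circ\!\!- Y$, which is all the algebraic identity requires.

```python
from math import log2


def dispersion_on_pangloss_plane(joint):
    """Var[log2 1/P_XY(X,Y)] for a joint distribution given as joint[x][y]."""
    probs = [p for row in joint for p in row if p > 0]
    mean = sum(p * log2(1.0 / p) for p in probs)
    return sum(p * (log2(1.0 / p) - mean) ** 2 for p in probs)


def tilted_density_two_ways(p_w, p_x_given_w, p_y_given_w, w, x, y, r1, r2):
    """Evaluate the three-term form with unit slopes and the entropy-density form.

    The joint law is P(w) * P(x|w) * P(y|w), i.e. X - W - Y is a Markov chain;
    all entries are assumed strictly positive.
    """
    p_xy = sum(p_w[v] * p_x_given_w[v][x] * p_y_given_w[v][y]
               for v in range(len(p_w)))
    p_w_given_xy = p_w[w] * p_x_given_w[w][x] * p_y_given_w[w][y] / p_xy
    three_term = (log2(p_w_given_xy / p_w[w])
                  + (log2(1.0 / p_x_given_w[w][x]) - r1)
                  + (log2(1.0 / p_y_given_w[w][y]) - r2))
    entropy_density = log2(1.0 / p_xy) - r1 - r2
    return three_term, entropy_density
```

The two returned values agree for every $(w, x, y)$, and for the doubly symmetric binary source `[[0.45, 0.05], [0.05, 0.45]]` the dispersion evaluates to $0.09 (\log_2 9)^2$.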


VI. Discussion 

In this paper, we derived a characterization of the second-order region of the Gray-Wyner network. Apart from the interest in this network itself, there is another motivation to study this problem. As we mentioned earlier, the characterization of the first-order region of multi-terminal problems typically involves auxiliary random variables; the involvement of auxiliary random variables is one of the reasons that the second-order analysis of multi-terminal problems is difficult. Thus, the result of this paper is an important step toward extending the second-order analysis to multi-terminal problems.

It seems that the next simple problems that involve auxiliary random variables are the coding problems with side-information (cf. [37]). In contrast to the Gray-Wyner network, the coding problems with side-information involve Markov chain structures on auxiliary random variables that stem from the distributed coding nature of the problems. Thus, the techniques used in this paper are not enough to solve these problems. However, we believe that the result in this paper at least gives some hints to tackle those problems.

16 It is apparent from (196) that the second derivative of $R(r_1, r_2 | P_\theta)$ around $(r_1^*, r_2^*, \theta(P_{XY}))$ is bounded. Furthermore, instead of checking differentiability of test channels, we can directly differentiate $\jmath_{XY}(x, y)$ in this case, and thus the validity of Lemma 3 is also guaranteed.

17 Since we do not know an explicit form of $\mathcal{R}_{\mathrm{GW}}(P_{XY})$ outside $\mathcal{PL}(P_{XY})$, it is not clear if the regularity condition is satisfied or not on the boundary of $\mathcal{PL}(P_{XY})$.

Fig. 2. A description of an outer region of $\mathcal{R}_{\mathrm{GW}}(P_{XY})$.